Add troubleshooting guide for certificate re-issuance loops #2128

Open
wallrj-cyberark wants to merge 1 commit into cert-manager:master from wallrj-cyberark:VC-54385-troubleshoot-reissuance-loop

wallrj-cyberark (Member) commented Jun 5, 2026

Preview: https://deploy-preview-2128--cert-manager.netlify.app/docs/troubleshooting/certificate-reissuing-loop/

Summary

Motivation

A customer reported an infinite re-issuance loop caused by External Secrets
Operator overwriting the target Secret. The investigation revealed this is a
known class of problem with no existing documentation. This page fills that gap,
helping users diagnose and resolve the issue without needing to search through
GitHub issues.

Test plan

  • `npm run check:spelling` passes (added `externalsecret` to `.spelling`)
  • `npm run check:markdown` (remark) passes
  • `npm run check:eslint` passes
  • `npm run check:stylelint` passes
  • `manifest.json` validates as correct JSON
  • Verify the Netlify preview renders the page correctly

Generated with Claude (Opus 4.6)

cert-manager-prow (bot) added labels on Jun 5, 2026: `dco-signoff: yes` (all commits in the pull request have the valid DCO sign-off message) and `size/L` (the PR changes 100-499 lines, ignoring generated files).
cert-manager-prow (Contributor) commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign erikgb for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

netlify (bot) commented Jun 5, 2026

Deploy Preview for cert-manager ready!

Built without sensitive environment variables

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 20bba20 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cert-manager/deploys/6a22f83ec209c600080af1ab |
| 😎 Deploy Preview | https://deploy-preview-2128--cert-manager.netlify.app |

wallrj-cyberark (Member, Author) left a comment

Self-review of the new troubleshooting page.

Scope: Four files changed, all documentation, no code. New troubleshooting page covering two main scenarios (external Secret manager conflicts, duplicate `spec.secretName`), plus navigation updates in `manifest.json` and the troubleshooting index.

Checks: `check:eslint`, `check:stylelint`, `check:markdown` (remark), and `check:spelling` all pass. The `check:links` failures are pre-existing (`mailto:` example addresses in `ingress.md`, `gateway.md`, `annotations.md`).

Netlify preview: the page renders correctly with admonitions, code blocks, and the table of contents sidebar.

Generated with Claude (Opus 4.6)

1. cert-manager successfully issues a certificate and writes the key material
to the target Secret.
2. The external controller overwrites or patches the Secret (e.g. on its next
sync interval), replacing the private key and/or certificate data.

Review comment (Member, Author):

The `SecretPublicKeyDiffersFromCurrentCertificateRequest` function name is taken directly from the cert-manager source at `internal/controller/certificates/policies/checks.go:189`. Including it here helps users who search logs or source for this identifier.

3. cert-manager detects that the private key in the Secret no longer matches
the CSR in the current CertificateRequest
(`SecretPublicKeyDiffersFromCurrentCertificateRequest`).

Review comment (Member, Author):

The key insight for this scenario: backoff counters reset on successful issuance, so the loop runs without delay. This is correct behaviour rather than a bug, but it means that each external write triggers another re-issuance, without bound. The `shouldBackoffReissuingOnFailure` function in `pkg/controller/certificates/trigger/trigger_controller.go:287` only backs off when `status.lastFailureTime` is set.
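
The rule in this comment can be modelled with a small sketch. It is a simplified illustration in Python, not the cert-manager implementation (which is written in Go): the function name `should_back_off` and the fixed one-hour `delay` are assumptions made for the example. Only the rule that backoff applies solely when a last failure time is recorded comes from the comment above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_back_off(last_failure_time: Optional[datetime],
                    now: datetime,
                    delay: timedelta = timedelta(hours=1)) -> bool:
    """Return True if re-issuance should be delayed.

    Simplified model: `delay` is a hypothetical fixed value, not
    cert-manager's actual backoff schedule.
    """
    # No recorded failure: nothing to back off from, so re-issue immediately.
    if last_failure_time is None:
        return False
    # A failure was recorded: hold back until the delay has elapsed.
    return now < last_failure_time + delay


now = datetime(2026, 6, 5, 16, 0, tzinfo=timezone.utc)

# After a successful issuance no failure time is recorded, so an external
# overwrite of the Secret leads to re-issuance with no delay.
after_success = should_back_off(None, now)

# After a failed issuance ten minutes ago, re-issuance is held back.
after_failure = should_back_off(now - timedelta(minutes=10), now)

print(after_success, after_failure)
```

In this model the first call returns `False` and the second returns `True`, which is why a loop made up entirely of successful issuances is never slowed down.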

### Diagnosis

Check for duplicate `spec.secretName` values across Certificate resources:

Review comment (Member, Author):

The `uniq -d -f1` approach finds duplicates by the second tab-separated field (`secretName`). It therefore also flags Certificates in different namespaces that share a `secretName`, even though those cannot actually conflict because Secrets are namespace-scoped. A reviewer might suggest scoping this to per-namespace duplicates only, but the broader output is arguably more useful as a diagnostic aid.
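
For comparison, the per-namespace variant mentioned in this comment could be sketched as follows. This is an illustrative Python sketch, not the command used on the page: it assumes input shaped like the JSON list printed by `kubectl get certificates --all-namespaces -o json`, and the function name `find_shared_secrets` and the sample data are hypothetical.

```python
from collections import defaultdict


def find_shared_secrets(cert_list: dict) -> dict:
    """Map (namespace, secretName) to the Certificates that share it."""
    by_target = defaultdict(list)
    for item in cert_list.get("items", []):
        meta = item["metadata"]
        secret_name = item.get("spec", {}).get("secretName")
        if secret_name is None:
            continue
        # Key on namespace AND secretName, since Secrets are namespace-scoped.
        by_target[(meta["namespace"], secret_name)].append(meta["name"])
    # Keep only targets claimed by two or more Certificates.
    return {k: sorted(v) for k, v in by_target.items() if len(v) > 1}


# Hypothetical sample: two Certificates in namespace "a" share a Secret,
# while the one in namespace "b" merely reuses the same name.
sample = {
    "items": [
        {"metadata": {"namespace": "a", "name": "web"},
         "spec": {"secretName": "tls"}},
        {"metadata": {"namespace": "a", "name": "web-copy"},
         "spec": {"secretName": "tls"}},
        {"metadata": {"namespace": "b", "name": "api"},
         "spec": {"secretName": "tls"}},
    ]
}

for (namespace, secret), names in find_shared_secrets(sample).items():
    print(f"{namespace}/{secret}: {', '.join(names)}")
```

With the sample data this reports only `a/tls`, shared by `web` and `web-copy`; the Certificate in namespace `b` is not flagged. Against a real cluster, `sample` could be replaced with `json.load(sys.stdin)` fed from the `kubectl` command above.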

- Document the root cause of infinite re-issuance loops when an
  external controller (e.g. External Secrets Operator) overwrites
  a cert-manager-managed Secret
- Document the duplicate spec.secretName variant where two
  Certificate resources target the same Secret
- Include symptoms, diagnosis steps, and fixes for each scenario
- Reference GitHub issues #4846, #5675, #6988, #6992, #8380

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Richard Wall <richard.wall@cyberark.com>
wallrj-cyberark force-pushed the VC-54385-troubleshoot-reissuance-loop branch from 888ab6c to 20bba20 on June 5, 2026 16:24